Attention for Fine-Grained Categorization

Authors

  • Pierre Sermanet
  • Andrea Frome
  • Esteban Real
Abstract

This paper presents experiments extending the work of Ba et al. (2014) on recurrent neural models for attention into less constrained visual environments, beginning with fine-grained categorization on the Stanford Dogs data set. In this work we use an RNN of the same structure but substitute a more powerful visual network and perform large-scale pre-training of the visual network outside of the attention RNN. Most work in attention models to date focuses on tasks with toy or more constrained visual environments. We present competitive results for fine-grained categorization. More importantly, we show that our model learns to direct high-resolution attention to the most discriminative regions without any spatial supervision such as bounding boxes. Given a small input window, it is hence able to discriminate fine-grained dog breeds with cheap glances at faces and fur patterns, while avoiding expensive and distracting processing of entire images. In addition to allowing high-resolution processing with a fixed budget and naturally handling static or sequential inputs, this approach has the major advantage of being trained end-to-end, unlike most current approaches, which are heavily engineered.
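The "small input window" and "fixed budget" ideas in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `extract_glimpse` and `glimpse_budget_fraction`, the zero-padding at image borders, and the single-resolution square window are all assumptions made for illustration. The paper's actual glimpse sensor and learned location policy are not described on this page.

```python
import numpy as np

def extract_glimpse(image, center, size):
    """Crop a size x size window around `center` (row, col).

    Hypothetical helper: zero-pads the image so that windows near the
    border always have the same shape.
    """
    h, w = image.shape[:2]
    half = size // 2
    # Pad only the two spatial axes; leave any channel axis untouched.
    pad = [(half, size - half), (half, size - half)] + [(0, 0)] * (image.ndim - 2)
    padded = np.pad(image, pad, mode="constant")
    # Clamp the centre to the image so the window stays inside the padded array.
    r = int(np.clip(center[0], 0, h - 1))
    c = int(np.clip(center[1], 0, w - 1))
    # Original pixel (r, c) sits at (r + half, c + half) in the padded array,
    # so the window starting at (r, c) places it at offset `half`.
    return padded[r:r + size, c:c + size]

def glimpse_budget_fraction(image_shape, size, num_glimpses):
    """Fraction of the image's pixels processed by `num_glimpses` windows."""
    h, w = image_shape[:2]
    return (num_glimpses * size * size) / (h * w)
```

For example, three 64 x 64 glimpses on a 512 x 512 image touch fewer than 5% of the pixels that full-image processing would, which is the kind of saving the abstract refers to. In the model described above, the glimpse locations would be chosen by the attention RNN rather than supplied by hand.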

Similar resources

Fine-Grained Categorization for 3D Scene Understanding

Fine-grained categorization of object classes is receiving increased attention, since it promises to automate classification tasks that are difficult even for humans, such as the distinction between different animal species. In this paper, we consider fine-grained categorization for a different reason: following the intuition that fine-grained categories encode metric information, we aim to gen...

Visual Representations for Fine-grained Categorization

Integrating Randomization and Discrimination for Classifying Human-Object Interaction Activities

Psychologists have shown that the ability of humans to perform basic-level categorization (e.g. cars vs. dogs; kitchen vs. highway) develops well before their ability to perform subordinate-level categorization, or fine-grained visual categorization (e.g. distinguishing dog breeds such as Golden retrievers vs. Labradors) [18]. It is interesting to observe that computer vision research has follo...

Efficient Two-Step Middle-Level Part Feature Extraction for Fine-Grained Visual Categorization

Fine-grained visual categorization (FGVC) has drawn increasing attention as an emerging research field in recent years. In contrast to generic-domain visual recognition, FGVC is characterized by high intra-class and subtle inter-class variations. To distinguish conceptually and visually similar categories, highly discriminative visual features must be extracted. Moreover, FGVC has highly special...

Weakly Supervised Fine-Grained Image Categorization

In this paper, we categorize fine-grained images without using any object / part annotation in either the training or the testing stage, a step towards making it suitable for deployments. Fine-grained image categorization aims to classify objects with subtle distinctions. Most existing works heavily rely on object / part detectors to build the correspondence between object parts by using o...

Journal:
  • CoRR

Volume: abs/1412.7054

Publication date: 2014